Tight Bounds for Data Stream Algorithms and Communication Problems

Authors

  • Mert Sağlam
  • Ramesh Krishnamurti
Abstract

In this thesis, we give efficient algorithms and near-tight lower bounds for the following problems in the streaming model. Improving on the works of Monemizadeh and Woodruff from SODA’10 and Andoni, Krauthgamer and Onak from FOCS’11, we give Lp-samplers requiring O(ε^{−p} log n) space for p ∈ (1, 2). Our algorithm also works for p ∈ [0, 1], taking Õ(ε^{−1} log n) space. As an application of our sampler, we give an O(log n) space algorithm for finding duplicates in data streams, improving the algorithms of Gopalan and Radhakrishnan from SODA’09. Given a stream that consists of a pattern of length m followed by a text of length n, the pattern matching problem is to output all occurrences of the pattern in the text. Improving on the results of Porat and Porat from FOCS’09, we give an O(log n log m) space algorithm that works entirely in the streaming model. Finally, we show several near-tight lower bounds for the above problems through new results in communication complexity.

Similar resources

Tight Lower Bounds for Multi-pass Stream Computation Via Pass Elimination

There is a natural relationship between lower bounds in the multi-pass stream model and lower bounds in multi-round communication. However, this connection is less understood than the connection between single-pass streams and one-way communication. In this paper, we consider data-stream problems for which reductions from natural multi-round communication problems do not yield tight bounds or d...

Tight Bounds for Graph Problems in Insertion Streams

Despite the large amount of work on solving graph problems in the data stream model, there do not exist tight space bounds for almost any of them, even in a stream with only edge insertions. For example, for testing connectivity, the upper bound is O(n log n) bits, while the lower bound is only Ω(n) bits. We remedy this situation by providing the first tight Ω(n log n) space lower bounds for rand...

Approximating the Longest Increasing Sequence and Distance from Sortedness in a Data Stream

We revisit the well-studied problem of estimating the sortedness of a data stream. We study the complementary problems of estimating the edit distance from sortedness (Ulam distance) and estimating the length of the longest increasing sequence (LIS). We present the first sub-linear space algorithms for these problems in the data stream model. • We give an O(log n) space, one-pass randomized algo...

Adapting Parallel Algorithms to the W-Stream Model, with Applications to Graph Problems

In this paper we show how parallel algorithms can be turned into efficient streaming algorithms for several classical combinatorial problems in the W-Stream model. In this model, at each pass one input stream is read, one output stream is written, and data items have to be processed using limited space; streams are pipelined in such a way that the output stream produced at pass i is given as inp...

Efficient and private distance approximation in the communication and streaming models

This thesis studies distance approximation in two closely related models: the streaming model and the two-party communication model. In the streaming model, a massive data stream is presented in an arbitrary order to a randomized algorithm that tries to approximate certain statistics of the data with only a few (usually one) passes over the data. For instance, the data may be a flow of packets o...

Publication date: 2016